Query Classification System Based on Snippet Summary Similarities for NTCIR-10 1CLICK-2 Task

Authors

  • Tatsuya Tojima
  • Takashi Yukawa
Abstract

This paper describes a query classification system for NTCIR-10 1CLICK-2. The system classifies queries in Japanese and English into eight predefined classes using support vector machines (SVMs). Feature vectors are created from snippet similarities instead of snippet word frequencies. These vectors have fewer dimensions than those built from raw words, which reduces the number of SVM parameters. As a result, the system generalizes better and requires fewer computing resources. Two methods for calculating document similarity, cosine similarity and the Jaccard index, were compared. Additionally, two snippet sources, Bing search results provided by the task organizer and Yahoo! Japan Web search results, were compared. Variants that add query string information to the snippet information in the feature vectors were also compared with these methods. Our system achieved 0.89 accuracy in the English task using cosine similarity and the Yahoo! Japan Web search results, and 0.86 in the Japanese task using cosine similarity and the Bing search results.
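The two similarity measures named in the abstract, and the idea of a low-dimensional feature vector built from similarities, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the function names, and the use of one reference text per class are all assumptions made for illustration, and Japanese text would need a proper morphological analyzer.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def cosine_similarity(tokens_a, tokens_b):
    # Cosine similarity between two term-frequency vectors.
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_index(tokens_a, tokens_b):
    # Jaccard index: size of the intersection over size of the union.
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def similarity_features(query_snippets, class_references, similarity):
    # Hypothetical helper: one feature per class reference text, so the
    # vector has as many dimensions as there are classes rather than
    # one dimension per vocabulary word.
    query_tokens = tokenize(query_snippets)
    return [similarity(query_tokens, tokenize(ref)) for ref in class_references]
```

With eight class reference texts, `similarity_features` would return an eight-dimensional vector that could then be passed to an SVM; how the reference texts are actually built is not stated in the abstract.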

Similar resources

A Query Classification System based on Snippet Similarity for a One-Click Search

This paper proposes a query classification system for a one-click search system that uses feature vectors based on snippet similarity. The proposed system targets the NTCIR-10 1CLICK-2 query classification subtask and classifies queries in Japanese and English into eight predefined classes by using support vector machines (SVMs). In the NTCIR-9 and NTCIR-10 tasks, most participants used complex...

Overview of the NTCIR-10 1CLICK-2 Task

This is an overview of the NTCIR-10 1CLICK-2 task (the second One Click Access task). Given a search query, 1CLICK aims to satisfy the user with a single textual output instead of a ranked list of URLs. Systems are expected to present important pieces of information first and to minimize the amount of text the user has to read. We designed English and Japanese 1CLICK tasks, in which 10 research...

TTOKU Summarization Based Systems at NTCIR-10 1CLICK-2 task

We describe our query-oriented summarization system implemented for the NTCIR-10 1CLICK-2 task. Our system is based purely on a summarization method and treats the task as a summarization process. The system calculates relevance scores of terms for a given query, then extracts the relevant parts of sentences from the input sources. For the calculation of relevance scores for a query, we employed a Query Sn...

MSRA at NTCIR-10 1CLICK-2

We describe Microsoft Research Asia’s approaches to the NTCIR-10 1CLICK-2 task. We built the system on a set of heuristic rules and varied the settings of our approaches to test the effectiveness of each one. The evaluation results show the effectiveness of the query attributes.

Hunter Gatherer: UdeM at 1CLICK-2

We describe our hunter-gatherer system for the NTCIR-10 1CLICK-2 task. We draw inspiration from the DeepQA framework and look to adapt it for the 1CLICK task. Several techniques can be integrated naturally into this framework. The hunter component generates candidates based on passage retrieval for the original query, and the gatherer component collects evidence for each candidate and scores them b...

Publication date: 2013